- 
                Notifications
    You must be signed in to change notification settings 
- Fork 101
Add CSM for batch write flow control #2685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add CSM for batch write flow control #2685
Conversation
2ca884e    to
    b4006a2      
    Compare
  
    | Instant now = Instant.now(); | ||
|  | ||
| if (now.isBefore(nextTime)) { | ||
| bigtableTracer.addBatchWriteFlowControlFactor(cappedFactor, status, false); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to check here that bigtableTracer is not null? I don't think you need status, it doesn't mean anything for this metric? Also the QPS is not updated, do we want to update the metric?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm good point, I guess in case context.getTracer() instanceof BigtableTracer is false? How do we deal with it now? Looks like we don't log error?
Recall that status is to record to indicate whether the factor is from the server or due to errors. One theory for the original issue was that a bunch of errors results in min_factor and throttles the client. We'd want to confirm that theory and know which error it is in the future.
We want to report this factor distribution even if it's not applied to target QPS, thus the field "applied" in the metric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We skip exporting the tracer if it's not an instance of BigtableTracer ( which should never happen). I think you can add a sanity check for not null.
        
          
                ...rc/main/java/com/google/cloud/bigtable/data/v2/stub/RateLimitingServerStreamingCallable.java
          
            Show resolved
            Hide resolved
        
              
          
                ...rc/main/java/com/google/cloud/bigtable/data/v2/stub/RateLimitingServerStreamingCallable.java
          
            Show resolved
            Hide resolved
        
      | * | ||
| * @param targetQps The new target QPS for the client. | ||
| */ | ||
| public void setBatchWriteFlowControlTargetQps(double targetQps) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@InternalApi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this and not other methods in this class?
| * @param status Status of the request. | ||
| * @param applied Whether the factor was actually applied. | ||
| */ | ||
| public void addBatchWriteFlowControlFactor(double factor, @Nullable Throwable status, boolean applied) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@InternalApi
| Aggregation.sum(), | ||
| InstrumentType.GAUGE, | ||
| "1", | ||
| ImmutableSet.<AttributeKey>builder().addAll(COMMON_ATTRIBUTES).add(STATUS_KEY).build()); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is what you defined on the server side? You shouldn't need the status_key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm very strange I didn't recall I added this... removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it should have method_key
        
          
                ...le/src/main/java/com/google/cloud/bigtable/data/v2/stub/metrics/BuiltinMetricsConstants.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      b4006a2    to
    e8a246f      
    Compare
  
    | Instant now = Instant.now(); | ||
|  | ||
| if (now.isBefore(nextTime)) { | ||
| bigtableTracer.addBatchWriteFlowControlFactor(cappedFactor, status, false); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We skip exporting the tracer if it's not an instance of BigtableTracer ( which should never happen). I think you can add a sanity check for not null.
| * | ||
| * @param targetQps The new target QPS for the client. | ||
| */ | ||
| public void setBatchWriteFlowControlTargetQps(double targetQps) {} | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add @InternalApi annotation here and the other method.
| static final AttributeKey<String> METHOD_KEY = AttributeKey.stringKey("method"); | ||
| static final AttributeKey<String> STATUS_KEY = AttributeKey.stringKey("status"); | ||
| static final AttributeKey<String> CLIENT_UID_KEY = AttributeKey.stringKey("client_uid"); | ||
| static final AttributeKey<Boolean> APPLIED_KEY = AttributeKey.booleanKey("applied"); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe rename it to BATCH_WRITE_QPS_APPLIED_KEY
| Aggregation.sum(), | ||
| InstrumentType.GAUGE, | ||
| "1", | ||
| ImmutableSet.<AttributeKey>builder().addAll(COMMON_ATTRIBUTES).add(STATUS_KEY).build()); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still here?
| "1", | ||
| ImmutableSet.<AttributeKey>builder() | ||
| .addAll(COMMON_ATTRIBUTES) | ||
| .add(STREAMING_KEY, STATUS_KEY) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is right, this should be APPLIED_KEY, STATUS_KEY and METHOD_KEY
| Aggregation.sum(), | ||
| InstrumentType.GAUGE, | ||
| "1", | ||
| ImmutableSet.<AttributeKey>builder().addAll(COMMON_ATTRIBUTES).add(STATUS_KEY).build()); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it should have method_key
| } | ||
|  | ||
| private void updateQps(double factor, Duration period) { | ||
| private void updateQps(double factor, Duration period, @Nullable Throwable t) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe instead of throwable pass in the status. So you don't need to do the null check, you can just have OK status if the update is from a response
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But updateQps is used by both onResponseImpl and onErrorImpl
e8a246f    to
    ae0ed5a      
    Compare
  
    
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> ☕️
If you write sample code, please follow the samples format.